\section{Flow Characteristics}
\label{flow}
\begin{figure}
\centering
\small
\begin{tabular}{cc}
\includegraphics[width=0.22\textwidth]{fig/fig_flow_size.pdf} &
\includegraphics[width=0.22\textwidth]{fig/fig_flow_length.pdf}\\
(a) Flow Size (Bytes) & (b) Flow Duration ($\mu$s) \\
\includegraphics[width=0.22\textwidth]{fig/fig_act_flow.pdf} &
\includegraphics[width=0.22\textwidth]{fig/fig_inter_time.pdf} \\
(c) Active Flows & (d) Flow Interarrival Time ($\mu$s)\\
\end{tabular}
\caption{Data Center Flow Distributions}
\label{fig:flows}
\end{figure}
To design Adaptive TCP, we leverage empirical observations of flow
characteristics in a real-world data center. Unsurprisingly, we find
flow distribution characteristics such as size and duration to be
similar to measurements reported in \cite{nature}, \cite{vl2} and \cite{character},
but we repeat them here to provide a basis for the design choices in
ATCP. A key difference is that we also analyze the temporal relationships
between flows, such as overlap and interarrival time.
These observations imply that small flows can take bandwidth away from
large flows, and that large flows receive the necessary compensation
only after the small flows complete.

The data center whose traces we analyze has a canonical 2-tier
architecture, in which Middle-of-Rack switches connect a
row of 5 to 6 racks. The Middle-of-Rack switches are connected by
aggregation switches with an over-subscription factor of 2. In total,
there are 500 servers and 22 network devices. To collect the packet
traces, we randomly selected a handful of locations
and installed sniffers. Our collection spanned 12 hours over multiple
days. According to our investigation and measurement (using the Bro
application identification tool), the applications inside the data
center are mainly web services (HTTP transactions, authentication
services, custom applications) and distributed file
system traffic.

First we examine the distribution of {\bf flow size}.
The flow size distribution in Figure~\ref{fig:flows}(a) indicates that 80\% of
the flows are smaller than 100KB; most of these are RPC requests and responses.
In particular, 99\% of the flows are smaller than 10MB, and the flows between 100KB
and 10MB are mainly web requests and responses.
The total-bytes distribution shows that most of the bytes
come from a few large flows; over 80\% of the bytes are sent
by flows larger than 10MB. These flows are few in number;
they are mainly introduced by activities such as backup, virtual machine
migration and large file transfers.
We use {\bf small}, {\bf medium} and {\bf large} to denote flows of size
[0, 100KB), [100KB, 10MB) and [10MB, $\infty$) respectively, and use these
terms in the following text.
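These size classes can be expressed as a simple threshold classifier; the following is a minimal Python sketch using the cutoffs above (the constant and function names are ours, not from our measurement tooling):

```python
# Classify a flow by its total bytes, using the thresholds from the text:
# small: [0, 100KB), medium: [100KB, 10MB), large: [10MB, infinity).
SMALL_MAX = 100 * 1024          # 100 KB
MEDIUM_MAX = 10 * 1024 * 1024   # 10 MB

def classify_flow(size_bytes):
    """Return 'small', 'medium', or 'large' for a flow of size_bytes."""
    if size_bytes < SMALL_MAX:
        return "small"
    if size_bytes < MEDIUM_MAX:
        return "medium"
    return "large"
```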
\begin{figure}
\centering
\includegraphics[width=0.25\textwidth]{fig/qua.pdf}
\caption{Flow Size and Time Sensitivity}
\label{fig:time_sen}
\end{figure}

Our measurements show that most time-sensitive applications, such as web
services, generate small flows, while large flows are usually not time-sensitive,
as shown in Figure~\ref{fig:time_sen}.

From Figure~\ref{fig:flows}(b), which shows {\bf flow duration}, we find that in our data center 80\%
of the flows last less than 10 seconds, but some
flows last for more than 100s. Most of the long-duration flows
are large flows (larger than 1GB). Although the link capacity is
1Gbps, a flow's sending rate is constrained by contention from other flows.

In Figure~\ref{fig:flows}(c), we present the distribution of the number of
{\bf active flows} within a one-second bin at one edge switch. In over 90\%
of the time instances, the number of active flows per edge switch is
between 1000 and 2000, and these flows are not uniformly distributed across
links. On some ``hot spot'' links there are tens of flows, and the
competition between them drives the utilization of those
links high.
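The active-flow count can be obtained by binning flow lifetimes into one-second intervals. Below is a minimal Python sketch, assuming each flow record carries (start, end) times in seconds; the function name is illustrative:

```python
from collections import Counter
import math

def active_flows_per_second(flows):
    """Count, for each one-second bin, how many flows are active in it.

    flows: iterable of (start_time, end_time) pairs in seconds.
    Returns a Counter mapping bin index -> number of active flows.
    """
    bins = Counter()
    for start, end in flows:
        # A flow is active in every one-second bin its lifetime touches.
        for b in range(math.floor(start), math.floor(end) + 1):
            bins[b] += 1
    return bins
```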

Flow {\bf interarrival time} is presented in Figure~\ref{fig:flows}(d). We
observe that 80\% of the flow interarrival times fall between 400$\mu$s and
40ms. Given that 20\% of the flows last longer
than 10s, large flows coexist with many small flows.
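Interarrival times are simply the gaps between consecutive flow arrival times; a minimal sketch (the helper name is ours):

```python
def interarrival_times(start_times):
    """Return the time gaps between consecutive flow arrivals.

    start_times: iterable of flow start times (any consistent unit).
    """
    ts = sorted(start_times)
    return [b - a for a, b in zip(ts, ts[1:])]
```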

We also look into the {\bf temporal relationship} between flows.
Most switches maintain queues at input and output ports,
and these queues typically share the same memory; thus contention between flows
is not restricted to links but extends to switch buffers.
We separate large flows from small and medium flows,
and calculate the fraction of the total measurement time
during which at least one large flow is active, which
is over 95\%. We then examine each small or medium flow to see whether it
overlaps with one or more large flows; we find that over 90\% of them
overlap in time with one or more large flows.
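This overlap test reduces to interval intersection. The following is a minimal Python sketch, assuming each flow is represented by its (start, end) times; the helper names are ours:

```python
def overlaps_large(flow, large_flows):
    """True if flow's [start, end] interval intersects any large flow's."""
    s, e = flow
    return any(s <= le and ls <= e for ls, le in large_flows)

def fraction_overlapping(small_medium, large_flows):
    """Fraction of small/medium flows that coexist with >= 1 large flow."""
    if not small_medium:
        return 0.0
    hits = sum(overlaps_large(f, large_flows) for f in small_medium)
    return hits / len(small_medium)
```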

The above measurement and analysis has three implications. First, in a
data center a variety of applications introduce flows with different
properties. Most flows are mice flows: small in size, but the majority in quantity.
Elephant flows are few in number, but they contribute the majority of the total bytes.
Second, as applications fill the data center's capacity,
flows end up competing, which results in bottlenecked links and switch
buffers; most of these resources (link capacity and buffers) are taken by large flows.
Third, most small flows coexist and compete with large flows
in the network; with large flows taking most of the network
resources, small flows suffer.
If small flows ``borrow'' some bandwidth from large flows, they would
complete more quickly, while large flows can be compensated in time after the
small flows complete. The resulting improvement for small flows benefits a
variety of applications.
